We propose a new kind of embedding for natural language text that deeply represents semantic meaning. Standard text embeddings use the vector output of a pretrained language model. In our method, we let a language model learn from the text and then literally pick its brain, taking the actual weights of the model's neurons to produce a vector. We call this representation of the text a neural embedding. The technique may extend beyond text and language models, but we first explore its properties for natural language processing. We compare neural embeddings with GPT sentence (SGPT) embeddings on several datasets. We observe that neural embeddings achieve comparable performance with a much smaller model, and that the errors are different.
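As a rough sketch of the "pick its brain" idea, the following illustrative Python code adapts a small pretrained language model to a single text and flattens the change in one weight matrix into a vector. The model, layer choice, and training schedule here are assumptions for illustration, not the authors' recipe.

```python
import copy
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def neural_embedding(text: str, model_name: str = "distilgpt2",
                     steps: int = 5, lr: float = 1e-4) -> torch.Tensor:
    """Hedged sketch: let a small LM learn from one text, then read off a weight delta."""
    tok = AutoTokenizer.from_pretrained(model_name)
    base = AutoModelForCausalLM.from_pretrained(model_name)
    model = copy.deepcopy(base)
    model.train()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    batch = tok(text, return_tensors="pt", truncation=True, max_length=512)
    for _ in range(steps):  # let the model "learn from the text"
        out = model(**batch, labels=batch["input_ids"])
        out.loss.backward()
        opt.step()
        opt.zero_grad()
    # Illustrative choice: the change in the final block's MLP projection weights.
    name = "transformer.h.5.mlp.c_proj.weight"
    delta = dict(model.named_parameters())[name] - dict(base.named_parameters())[name]
    return delta.flatten().detach()
```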
Factual consistency is one of the important dimensions of summary evaluation, especially as generated summaries become more fluent and coherent. The ESTIME measure, recently proposed specifically for factual consistency, achieves high correlation with human expert scores, but is in principle limited to evaluating text-summary pairs with high lexical overlap. This is not a problem for current summarization styles, but it may become an obstacle for future summarization systems, or for evaluating arbitrary claims against a text. In this work we generalize the method, making it applicable to any text-summary pair. Since ESTIME uses points of contextual similarity, it also provides insight into how useful the information captured at different BERT layers is. We observe that useful information is present in almost all layers except the several lowest ones. For consistency and fluency, two well-known quality dimensions, the most useful layers are close to the top (but not at the top); for coherence and relevance we found a more complex and interesting picture.
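As a rough illustration of the kind of per-layer signal involved, the sketch below scores a text-summary pair by contextual similarity at a chosen BERT layer; it is only a simplified stand-in under assumed defaults, not the exact ESTIME formulation.

```python
import torch
from transformers import AutoModel, AutoTokenizer

def layer_similarity(text: str, summary: str,
                     layer: int = 9, model_name: str = "bert-base-uncased") -> float:
    """Average best-match cosine similarity of summary tokens to text tokens at one layer."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_hidden_states=True).eval()
    with torch.no_grad():
        h_text = model(**tok(text, return_tensors="pt", truncation=True)).hidden_states[layer][0]
        h_sum = model(**tok(summary, return_tensors="pt", truncation=True)).hidden_states[layer][0]
    h_text = torch.nn.functional.normalize(h_text, dim=-1)
    h_sum = torch.nn.functional.normalize(h_sum, dim=-1)
    # For each summary token, take its best-matching text token; average over the summary.
    return (h_sum @ h_text.T).max(dim=-1).values.mean().item()
```

Sweeping `layer` from 1 to 12 is one simple way to probe which layers carry the most useful signal.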
We present Names, a dataset of entities with ambiguous names obtained from English Wikipedia and news articles. It consists of 58,862 mentions of 4,148 unique entities and their names: 1,000 mentions from news, 28,843 from Wikipedia articles about the entities, and 29,019 from Wikipedia backlink mentions. The dataset should help establish a challenging benchmark for the task of Named Entity Linking (NEL).
The creation of a quality summarization dataset is an expensive, time-consuming effort, requiring the production and evaluation of summaries by both trained humans and machines. If such an effort is made in one language, it would be beneficial to be able to use it in other languages without repeating the human annotation. To investigate how much we can trust machine translation of such a dataset, we translated an English summarization dataset into seven languages and compared performance across automatic evaluation measures. We explore equivalence testing as the appropriate statistical paradigm for evaluating the correlation between human and automated scoring of summaries. While we found potential for dataset reuse in languages similar to the source, most summary evaluation methods were not found to hold up across translation.
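As a concrete illustration of the statistical paradigm, an equivalence check based on two one-sided tests (TOST) on the difference of two correlations might look like the sketch below; the equivalence margin, its expression in Fisher-z space, and the significance level are illustrative assumptions, not the thresholds used in the study.

```python
import numpy as np
from scipy import stats

def correlations_equivalent(r_src, n_src, r_tr, n_tr, margin=0.1, alpha=0.05):
    """TOST: are the source-language and translated-data correlations equivalent?"""
    d = np.arctanh(r_src) - np.arctanh(r_tr)         # difference in Fisher-z space
    se = np.sqrt(1.0 / (n_src - 3) + 1.0 / (n_tr - 3))
    m = np.arctanh(margin)                           # margin in z-space (assumption)
    p_lower = 1.0 - stats.norm.cdf((d + m) / se)     # H0: difference <= -margin
    p_upper = stats.norm.cdf((d - m) / se)           # H0: difference >= +margin
    return max(p_lower, p_upper) < alpha             # reject both => equivalence
```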
The goal of a summary is to concisely state the most important information in a document. With this principle in mind, we introduce new reference-free summary evaluation metrics that use a pretrained language model to estimate the information content shared between a document and its summary. These metrics are a modernization of the Shannon Game, a method for scoring summary quality proposed decades ago, in which we replace the human annotators with language models. We also view these metrics as an extension of BLANC, a recently proposed approach to measuring summary quality based on the performance of a language model with and without the help of a summary. Using transformer-based language models, we empirically verify that our metrics achieve state-of-the-art correlation with human judgement on the coherence and relevance dimensions of summary quality, as well as competitive correlation with human judgement on consistency and fluency.
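To make the underlying idea concrete, here is a hedged sketch that estimates how much a summary tells a language model about a document as the gain in document log-likelihood when the model conditions on the summary. The model choice and prompt layout are assumptions; this is not the paper's exact metric.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def information_gain(document: str, summary: str, model_name: str = "gpt2") -> float:
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).eval()

    def doc_logprob(prefix: str) -> float:
        """Total log-probability (nats) of the document tokens given a prefix."""
        ids = tok(prefix + document, return_tensors="pt",
                  truncation=True, max_length=1024).input_ids
        n_prefix = len(tok(prefix).input_ids)
        with torch.no_grad():
            logits = model(ids).logits
        logp = torch.log_softmax(logits[0, :-1], dim=-1)
        targets = ids[0, 1:]
        token_lp = logp[torch.arange(targets.numel()), targets]
        return token_lp[n_prefix - 1:].sum().item()  # score only the document part

    gain = doc_logprob("Summary: " + summary + "\n\nDocument: ") \
         - doc_logprob("Document: ")
    return gain  # higher => the summary carries more information about the document
```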
We demonstrate a proof-of-concept of a large language model conducting corporate lobbying related activities. We use an autoregressive large language model (OpenAI's text-davinci-003) to determine if proposed U.S. Congressional bills are relevant to specific public companies and provide explanations and confidence levels. For the bills the model deems as relevant, the model drafts a letter to the sponsor of the bill in an attempt to persuade the congressperson to make changes to the proposed legislation. We use hundreds of ground-truth labels of the relevance of a bill to a company to benchmark the performance of the model, which outperforms the baseline of predicting the most common outcome of irrelevance. However, we test the ability to determine the relevance of a bill with the previous OpenAI GPT-3 model (text-davinci-002), which was state-of-the-art on many language tasks until text-davinci-003 was released on November 28, 2022. The performance of text-davinci-002 is worse than simply always predicting that a bill is irrelevant to a company. These results suggest that, as large language models continue to improve core natural language understanding capabilities, performance on corporate lobbying related tasks will continue to improve. We then discuss why this could be problematic for societal-AI alignment.
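As a rough illustration of the relevance-classification step, here is a hedged sketch using the legacy OpenAI Python Completions API (openai package < 1.0), which is how text-davinci-003 was accessed; the prompt wording, output parsing, and parameters are assumptions, not the paper's actual prompts.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def bill_relevance(bill_summary: str, company_description: str) -> str:
    """Ask the model whether a bill is relevant to a company, with explanation and confidence."""
    prompt = (
        "You are assessing corporate lobbying relevance.\n"
        f"Company: {company_description}\n"
        f"Bill summary: {bill_summary}\n"
        "Is this bill relevant to the company? Answer YES or NO, then give a "
        "one-sentence explanation and a confidence from 0 to 100.\n"
    )
    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        max_tokens=150,
        temperature=0.0,  # deterministic output for benchmarking
    )
    return resp["choices"][0]["text"].strip()
```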
Variational autoencoders model high-dimensional data by positing low-dimensional latent variables that are mapped through a flexible distribution parametrized by a neural network. Unfortunately, variational autoencoders often suffer from posterior collapse: the posterior of the latent variables is equal to its prior, rendering the variational autoencoder useless as a means to produce meaningful representations. Existing approaches to posterior collapse often attribute it to the use of neural networks or optimization issues due to variational approximation. In this paper, we consider posterior collapse as a problem of latent variable non-identifiability. We prove that the posterior collapses if and only if the latent variables are non-identifiable in the generative model. This fact implies that posterior collapse is not a phenomenon specific to the use of flexible distributions or approximate inference. Rather, it can occur in classical probabilistic models even with exact inference, which we also demonstrate. Based on these results, we propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility. This model class resolves the problem of latent variable non-identifiability by leveraging bijective Brenier maps and parameterizing them with input convex neural networks, without special variational inference objectives or optimization tricks. Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
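For readers unfamiliar with the building block, below is a minimal sketch of an input convex neural network (ICNN), a scalar-valued network convex in its input whose gradient can serve as a Brenier map. The layer sizes, activations, and initialization are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    """Scalar-valued network convex in its input x."""
    def __init__(self, dim: int, hidden: int = 64, depth: int = 3):
        super().__init__()
        # Layers applied directly to the input (unconstrained weights are fine).
        self.input_layers = nn.ModuleList(nn.Linear(dim, hidden) for _ in range(depth))
        # Hidden-to-hidden weights are kept non-negative via softplus to preserve convexity.
        self.hidden_weights = nn.ParameterList(
            nn.Parameter(0.01 * torch.randn(hidden, hidden)) for _ in range(depth - 1)
        )
        self.out_weight = nn.Parameter(0.01 * torch.randn(1, hidden))
        self.out_bias = nn.Parameter(torch.zeros(1))
        self.act = nn.Softplus()  # convex and non-decreasing

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.act(self.input_layers[0](x))
        for lin, w in zip(self.input_layers[1:], self.hidden_weights):
            z = self.act(lin(x) + z @ F.softplus(w).T)
        return z @ F.softplus(self.out_weight).T + self.out_bias  # shape (batch, 1)

def brenier_map(f: ICNN, x: torch.Tensor) -> torch.Tensor:
    """Gradient of the convex potential f, i.e. the (approximate) Brenier map."""
    x = x.detach().clone().requires_grad_(True)
    return torch.autograd.grad(f(x).sum(), x, create_graph=True)[0]
```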
We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.
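As a purely illustrative sketch (not the official av2 API), the per-actor track history a motion-forecasting model consumes might be organized as follows; field names and category labels are assumptions.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TrackState:
    timestep: int     # index into the observed history
    x: float          # map-frame position, meters
    y: float
    heading: float    # radians
    vx: float         # velocity components, m/s
    vy: float

@dataclass
class ActorTrack:
    track_id: str
    category: str     # e.g. "vehicle", "pedestrian", "cyclist" (illustrative labels)
    scored: bool      # whether the benchmark requires predicting this actor's future motion
    history: List[TrackState]
```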
In this paper we derive a PAC-Bayesian-Like error bound for a class of stochastic dynamical systems with inputs, namely, for linear time-invariant stochastic state-space models (stochastic LTI systems for short). This class of systems is widely used in control engineering and econometrics, in particular, they represent a special case of recurrent neural networks. In this paper we 1) formalize the learning problem for stochastic LTI systems with inputs, 2) derive a PAC-Bayesian-Like error bound for such systems, 3) discuss various consequences of this error bound.
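For reference, a standard (innovation-form) stochastic LTI state-space model of the kind referred to above can be written as follows; the notation is illustrative and may differ from the paper's.

```latex
\begin{align}
  x_{t+1} &= A\,x_t + B\,u_t + K\,e_t \\
  y_t     &= C\,x_t + D\,u_t + e_t
\end{align}
% Here $x_t$ is the hidden state, $u_t$ the input, $y_t$ the observed output,
% $e_t$ a zero-mean i.i.d. noise (innovation) process, and $A,B,C,D,K$ are constant matrices.
```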
We demonstrate how efficient autonomous drone swarms can be in detecting and tracking occluded targets in densely forested areas, such as lost people during search and rescue missions. Exploration and optimization of local viewing conditions, such as occlusion density and target view obliqueness, provide much faster and much more reliable results than previous, blind sampling strategies that are based on pre-defined waypoints. An adapted real-time particle swarm optimization and a new objective function are presented that are able to deal with dynamic and highly random through-foliage conditions. Synthetic aperture sensing is our fundamental sampling principle, and drone swarms are employed to approximate the optical signals of extremely wide and adaptable airborne lenses.
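For readers unfamiliar with the optimizer being adapted, here is a generic particle swarm optimization loop in Python. The objective is a placeholder standing in for the paper's view-quality function (which accounts for local conditions such as occlusion density and view obliqueness), and all hyperparameters are illustrative assumptions.

```python
import numpy as np

def pso(objective, dim, n_particles=20, iters=100, bounds=(-1.0, 1.0),
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with a standard global-best PSO."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, size=(n_particles, dim))   # e.g. candidate view poses
    vel = np.zeros_like(pos)
    best_pos = pos.copy()
    best_val = np.array([objective(p) for p in pos])
    g_idx = best_val.argmin()
    g_pos, g_val = best_pos[g_idx].copy(), best_val[g_idx]

    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (best_pos - pos) + c2 * r2 * (g_pos - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([objective(p) for p in pos])
        improved = vals < best_val
        best_pos[improved], best_val[improved] = pos[improved], vals[improved]
        if vals.min() < g_val:
            g_idx = vals.argmin()
            g_pos, g_val = pos[g_idx].copy(), vals[g_idx]
    return g_pos, g_val

# Placeholder objective; in the application this would be a view-quality score.
best_view, best_score = pso(lambda p: np.sum(p**2), dim=3)
```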